AITopics | shape parameter

shape parameter

Adaptive RBF-KAN: A Comparative Evaluation of Dynamic Shape Parameters in Kolmogorov-Arnold Networks

Cavoretto, Roberto, De Rossi, Alessandra, Haider, Adeeba, Noorizadegan, Amir

arXiv.org Machine Learning, May-22-2026

Kolmogorov-Arnold Networks (KANs) approximate multivariate functions using learnable univariate edge functions, typically parameterized by B-spline bases. Although effective, spline-based implementations can be computationally expensive. A modified version of KANs, called FastKAN, improves efficiency by replacing splines with Gaussian radial basis functions (RBFs), but it relies on a fixed kernel and shape parameter. In this work, we extend the RBF-based KAN framework by introducing a broader family of radial basis kernels and by initializing the kernel shape parameter using leave-one-out cross-validation (LOOCV). To the best of our knowledge, this is the first study that integrates LOOCV-based kernel scale estimation with deep KAN training. We also introduce Matérn and Wendland kernels into the KAN framework for the first time, enabling more flexible basis representations beyond the Gaussian kernel used in FastKAN. The LOOCV estimate provides a data-driven initialization of the kernel scale, which is subsequently refined during network training. The proposed adaptive RBF-KAN is evaluated on several two-dimensional benchmark functions. The results highlight the importance of kernel selection and adaptive shape parameters, with different kernels showing advantages for smooth functions, discontinuities, and oscillatory patterns. Overall, combining LOOCV-based initialization with adaptive kernel learning provides a practical strategy for improving RBF-based KAN models.
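
The LOOCV-based initialization described above can be illustrated with a minimal sketch. The abstract does not say how the leave-one-out error is computed, so this example assumes a one-dimensional Gaussian RBF interpolant and uses Rippa's closed-form shortcut, in which the leave-one-out residual at point i is the i-th interpolation coefficient divided by the i-th diagonal entry of the inverse kernel matrix. The function names, the small ridge term, and the candidate grid are hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np

def gaussian_kernel_matrix(x, eps):
    """Gaussian RBF matrix with entries exp(-(eps * |x_i - x_j|)^2)."""
    r = np.abs(x[:, None] - x[None, :])
    return np.exp(-(eps * r) ** 2)

def loocv_error(x, y, eps, ridge=1e-8):
    """Norm of the leave-one-out residuals via Rippa's shortcut.

    The ridge term is an assumption added here only to keep the
    matrix inverse numerically stable for flat kernels.
    """
    A = gaussian_kernel_matrix(x, eps) + ridge * np.eye(len(x))
    A_inv = np.linalg.inv(A)
    coeffs = A_inv @ y
    # e_i = c_i / (A^{-1})_{ii}, with no need to refit n separate models
    return float(np.linalg.norm(coeffs / np.diag(A_inv)))

def select_shape_parameter(x, y, candidates):
    """Return the candidate shape parameter with the smallest LOOCV error."""
    errors = [loocv_error(x, y, eps) for eps in candidates]
    return float(candidates[int(np.argmin(errors))])

# Hypothetical usage: pick an initial shape parameter on a toy 1-D target.
x = np.linspace(-1.0, 1.0, 15)
y = np.sin(np.pi * x)
eps_init = select_shape_parameter(x, y, np.logspace(-0.3, 1.0, 20))
```

In a setting like the one the abstract describes, a value such as `eps_init` would only seed a trainable shape parameter, which gradient-based training then refines.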

artificial intelligence, kan, machine learning, (18 more...)

arXiv.org Machine Learning

2605.21534

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Supplementary Material for DreamHuman: Animatable 3D Avatars from Text

Neural Information Processing Systems, Apr-25-2026, 20:31:06 GMT

This document contains additional details and experiments that did not fit in the main text due to space constraints. For animations and additional results, please also check the included videos. We use an optimization strategy similar to DreamFusion's, so unless otherwise noted the hyperparameters remain the same. For example, we use the Distributed Shampoo optimizer [2]. As in DreamFusion, we also train on a TPUv4 machine with 4 chips.

artificial intelligence, geometry, shape parameter, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Vision (0.70)

Empowering Convolutional Neural Networks with MetaSin Activation

Neural Information Processing Systems, Apr-25-2026, 17:32:41 GMT

ReLU networks have remained the default choice for image prediction models despite their well-established spectral bias towards learning low frequencies faster, and their consequent difficulty in reproducing high-frequency visual details. As an alternative, sin networks showed promising results in learning implicit representations of visual data. However, training these networks in practically relevant settings proved difficult, requiring careful initialization and dealing with issues due to inconsistent gradients and a degeneracy in local minima. In this work, we instead propose replacing a baseline network's existing activations with a novel ensemble function with trainable parameters. The proposed MetaSin activation can be trained reliably without requiring intricate initialization schemes, and results in consistently lower test loss compared to alternatives. We demonstrate our method in the areas of Monte-Carlo denoising and image resampling, where we set a new state of the art through a knowledge-distillation-based training procedure. We present ablations on hyper-parameter settings, comparisons with alternative activation function formulations, and discuss the use of our method in other domains, such as image classification.

artificial intelligence, deep learning, machine learning, (19 more...)

Neural Information Processing Systems

Country:

North America > United States (0.46)
North America > Canada > Ontario (0.28)

Genre: Research Report (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Fast Asymptotically Optimal Algorithms for Non-Parametric Stochastic Bandits

Neural Information Processing Systems, Feb-9-2026, 02:49:37 GMT

We consider the problem of regret minimization in non-parametric stochastic bandits. When the rewards are known to be bounded from above, there exist asymptotically optimal algorithms, with asymptotic regret depending on an infimum of Kullback-Leibler (KL) divergences.

artificial intelligence, data mining, machine learning, (21 more...)

Neural Information Processing Systems

Country:

Europe > France > Hauts-de-France > Nord > Lille (0.04)
North America > United States > Colorado > Boulder County > Boulder (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Food & Agriculture > Agriculture (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.60)
Information Technology > Data Science > Data Mining > Big Data (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Empowering Convolutional Neural Networks with MetaSin Activation

Neural Information Processing Systems, Feb-8-2026, 17:54:35 GMT

This work was done while the author was an intern at Disney Research | Studios. 37th Conference on Neural Information Processing Systems (NeurIPS 2023).

artificial intelligence, deep learning, machine learning, (19 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(4 more...)

Genre: Research Report (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

BackSlash: Rate Constrained Optimized Training of Large Language Models

Wu, Jun, Wen, Jiangtao, Han, Yuxing

arXiv.org Artificial Intelligence, Nov-19-2025

The rapid advancement of large language models (LLMs) has driven extensive research into parameter compression after training has been completed, yet compression during the training phase remains largely unexplored. In this work, we introduce Rate-Constrained Training (BackSlash), a novel training-time compression approach based on rate-distortion optimization (RDO). BackSlash enables a flexible trade-off between model accuracy and complexity, significantly reducing parameter redundancy while preserving performance. Experiments across various architectures and tasks demonstrate that BackSlash can reduce memory usage by 60% to 90% without accuracy loss and provides significant compression gains compared to compression after training. Moreover, BackSlash proves to be highly versatile: it enhances generalization with small Lagrange multipliers, improves model robustness to pruning (maintaining accuracy even at 80% pruning rates), and enables network simplification for accelerated inference on edge devices.
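
The rate-distortion trade-off mentioned above is commonly written as a Lagrangian, J = D + λR, where D is the task loss (distortion), R is the cost in bits of storing the parameters (rate), and λ is the Lagrange multiplier. The sketch below only evaluates such an objective. The abstract does not describe BackSlash's rate model, so the rate here is an assumed stand-in: the empirical entropy of uniformly quantized weights. The function names and the quantization step are hypothetical.

```python
import numpy as np

def empirical_rate_bits(weights, step):
    """Estimated bits to store the weights after uniform quantization.

    Assumption: rate = (entropy of the quantized symbols) * (number of weights).
    This is an illustrative proxy, not the rate model used by BackSlash.
    """
    q = np.round(np.asarray(weights, dtype=float) / step).astype(np.int64)
    _, counts = np.unique(q, return_counts=True)
    p = counts / counts.sum()
    bits_per_weight = -np.sum(p * np.log2(p))
    return float(bits_per_weight * q.size)

def rd_objective(distortion, weights, lam, step=0.01):
    """Lagrangian J = D + lam * R for a given task loss and weight vector."""
    return distortion + lam * empirical_rate_bits(weights, step)

# Hypothetical usage: a sparser weight vector yields a lower objective
# at the same task loss because it costs fewer bits.
dense = np.array([0.31, -0.12, 0.07, 0.45, -0.28, 0.19])
sparse = np.array([0.31, 0.0, 0.0, 0.45, 0.0, 0.0])
j_dense = rd_objective(1.0, dense, lam=0.05)
j_sparse = rd_objective(1.0, sparse, lam=0.05)
```

Because the entropy of quantized values is not differentiable, a training procedure would need a differentiable surrogate for the rate term; this sketch is only meant to show how λ weighs bits against loss.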

backslash, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2504.16968

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Fast Asymptotically Optimal Algorithms for Non-Parametric Stochastic Bandits

Neural Information Processing Systems, Oct-8-2025, 07:27:43 GMT

algorithm, bandit problem, imed, (16 more...)

Neural Information Processing Systems

Country:

Europe > France > Hauts-de-France > Nord > Lille (0.04)
North America > United States > Colorado > Boulder County > Boulder (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Food & Agriculture > Agriculture (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.60)
Information Technology > Data Science > Data Mining > Big Data (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems, Oct-3-2025, 05:11:49 GMT

The authors discuss how the problems can be formulated as optimization of objective functions defined on the subgraphs. A straightforward search over the subgraphs is computationally infeasible, so the authors present a highly novel approach that leads to computationally efficient tests. The paper includes proofs that the tests are nearly minimax optimal for the exponential family of distributions and graphs satisfying the polynomial growth property. The paper concludes with an analysis of synthetic and real datasets. Strengths: (1) The paper addresses a problem of growing importance and presents novel approaches for statistical tests.

constraint, graph, subgraph, (12 more...)

Neural Information Processing Systems

Country: North America > Canada > Quebec > Montreal (0.04)

Genre:

Summary/Review (1.00)
Research Report > Promising Solution (0.55)
Overview > Innovation (0.55)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.90)

The Poisson Gamma Belief Network

Mingyuan Zhou, Yulai Cong, Bo Chen

Neural Information Processing Systems, Oct-2-2025, 16:11:39 GMT

To infer a multilayer representation of high-dimensional count vectors, we propose the Poisson gamma belief network (PGBN) that factorizes each of its layers into the product of a connection weight matrix and the nonnegative real hidden units of the next layer. The PGBN's hidden layers are jointly trained with an upward-downward Gibbs sampler, each iteration of which upward samples Dirichlet distributed connection weight vectors starting from the first layer (bottom data layer), and then downward samples gamma distributed hidden units starting from the top hidden layer. The gamma-negative binomial process combined with a layer-wise training strategy allows the PGBN to infer the width of each layer given a fixed budget on the width of the first layer. The PGBN with a single hidden layer reduces to Poisson factor analysis. Example results on text analysis illustrate interesting relationships between the width of the first layer and the inferred network structure, and demonstrate that the PGBN, whose hidden units are imposed with correlated gamma priors, can add more layers to increase its performance gains over Poisson factor analysis, given the same limit on the width of the first layer.
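
The layered factorization described above can be sketched as a top-down generative sampler. This is a simplified illustration under stated assumptions: each hidden layer is gamma distributed with shape equal to the next layer's hidden units multiplied by a connection weight matrix, the columns of every weight matrix are Dirichlet distributed, and the observed counts are Poisson with rate given by the first layer's product. The fixed hyperparameters `r`, `c`, and `eta`, and the function name, are hypothetical; the gamma-negative binomial process that infers layer widths is not modeled here.

```python
import numpy as np

def sample_pgbn(widths, n_docs, r=1.0, c=1.0, eta=0.5, seed=0):
    """Draw count data from a simplified PGBN-style generative model.

    widths = [V, K1, ..., KT]: vocabulary size followed by hidden layer widths.
    Returns the (V, n_docs) count matrix and the list of weight matrices.
    """
    rng = np.random.default_rng(seed)
    # phis[t] has shape (widths[t], widths[t + 1]); each column sums to 1.
    phis = [rng.dirichlet(np.full(widths[t], eta), size=widths[t + 1]).T
            for t in range(len(widths) - 1)]
    # Top hidden layer: gamma units with an assumed fixed shape r.
    theta = rng.gamma(shape=r, scale=1.0 / c, size=(widths[-1], n_docs))
    # Propagate downward: each layer's gamma shape is Phi @ theta from above.
    for phi in reversed(phis[1:]):
        theta = rng.gamma(shape=phi @ theta, scale=1.0 / c)
    # Observed counts are Poisson with rate Phi^(1) @ theta^(1).
    counts = rng.poisson(phis[0] @ theta)
    return counts, phis

# Hypothetical usage: 30-word vocabulary, two hidden layers, 5 documents.
counts, phis = sample_pgbn([30, 8, 4], n_docs=5)
```

With a single hidden layer, for example `sample_pgbn([30, 8], n_docs=5)`, the loop body never runs and the sampler reduces to a Poisson factor analysis model, matching the special case noted in the abstract.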

artificial intelligence, deep learning, machine learning, (16 more...)

Neural Information Processing Systems

Country: